Kim, Su Nam and Timothy Baldwin (2008) An Unsupervised Approach to Interpreting Noun Compounds, In Proceedings of 2008 IEEE International Conference on Natural Language Processing and Knowledge Engineering (IEEE NLP-KE'08), Beijing, China
Abstract
This paper proposes an unsupervised approach to automatically interpret noun compounds using semantic similarity. Our proposed unsupervised method is based on obtaining a large amount of robust evidence for NC interpretation. In order to obtain evidence sentences for semantic relations (SRs), we first acquired sentences containing both a head noun and its modifier in the form of SR definitions. Then we determined the semantic relations represented in the sentences by looking at the nouns in the test instances (noun mapping) and verbs in the SR definitions (verb mapping). In the noun mapping, we measured the similarity between nouns in test instances and nouns in the collected sentences. In the verb mapping, we mapped the verbs of sentences onto those in the SR definitions. Finally, we built a statistical classifier to interpret noun compounds and evaluated it over 17 SRs defined in [1].
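The two mapping steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the seed verbs in `SR_VERBS`, the scores in `NOUN_SIM`, and the `(modifier, verb, head)` evidence format are all hypothetical, and a simple sum of similarity scores stands in for the statistical classifier mentioned in the abstract.

```python
from collections import defaultdict

# Hypothetical seed verbs for two semantic relations (SRs); not taken from the paper.
SR_VERBS = {
    "CAUSE": {"cause", "induce"},
    "MAKE": {"make", "produce"},
}

# Hypothetical noun-noun similarity scores in [0, 1]. A real system would
# more likely use a lexical resource-based similarity measure.
NOUN_SIM = {
    frozenset(("flu", "influenza")): 0.9,
    frozenset(("car", "automobile")): 0.95,
}


def noun_similarity(a, b):
    """Return 1.0 for identical nouns, a table score if known, else 0.0."""
    if a == b:
        return 1.0
    return NOUN_SIM.get(frozenset((a, b)), 0.0)


def interpret(modifier, head, evidence):
    """Pick the SR best supported by the evidence triples.

    evidence: iterable of (modifier_noun, verb, head_noun) triples assumed
    to have been extracted from the collected sentences.
    """
    scores = defaultdict(float)
    for ev_mod, verb, ev_head in evidence:
        for sr, verbs in SR_VERBS.items():
            # Verb mapping: does the sentence verb match a verb tied to this SR?
            if verb in verbs:
                # Noun mapping: weight the evidence by how similar its nouns
                # are to the nouns of the test noun compound.
                scores[sr] += (noun_similarity(modifier, ev_mod)
                               * noun_similarity(head, ev_head))
    best = max(scores, key=scores.get, default=None)
    if best is None or scores[best] == 0.0:
        return None  # no usable evidence
    return best


evidence = [
    ("influenza", "cause", "fever"),
    ("factory", "make", "car"),
]
print(interpret("flu", "fever", evidence))
```

With the toy data, the noun compound "flu fever" is matched to the "influenza cause fever" evidence and therefore receives the hypothetical CAUSE label.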
Similar resources
Kim, Su Nam and Timothy Baldwin (2008) Benchmarking Noun Compound Interpretation, In Proceedings of the Third International Joint Conference on Natural Language Processing (IJCNLP 2008), Hyderabad, India
In this paper we provide benchmark results for two classes of methods used in interpreting noun compounds (NCs): semantic similarity-based methods and their hybrids. We evaluate the methods using 7-way and binary class data from the nominal pair interpretation task of SEMEVAL-2007. We summarize and analyse our results, with the intention of providing a framework for benchmarking future researc...
Kim, Su Nam and Timothy Baldwin (to appear) Word Sense Disambiguation and Noun Compounds, ACM Transactions on Speech and Language Processing
In this paper, we investigate word sense distributions in noun compounds (NCs). Our primary goal is to disambiguate the word sense of component words in NCs, based on investigation of “semantic collocation” between them. We use sense collocation and lexical substitution to build supervised and unsupervised word sense disambiguation (WSD) classifiers, and show our unsupervised learner to be supe...
Kim, Su Nam and Timothy Baldwin (2013) A Lexical Semantic Approach to Interpreting and Bracketing English Noun Compounds, Natural Language Engineering 19(3), pp. 385-407
This paper presents a study on the interpretation and bracketing of noun compounds (“NCs”), based on lexical semantics. Our primary goal is to develop a method to automatically interpret NCs through the use of semantic relations. Our NC interpretation method is based on lexical similarity with tagged NCs, based on lexical similarity measures derived from WordNet. We apply the interpretation meth...
Baldwin, Timothy, Su Nam Kim, Francis Bond, Sanae Fujita, David Martinez and Takaaki Tanaka (2008) MRD-based Word Sense Disambiguation: Further Extending Lesk, In Proceedings of the Third International Joint Conference on Natural Language Processing (IJCNLP 2008), Hyderabad, India
This paper reconsiders the task of MRD-based word sense disambiguation, in extending the basic Lesk algorithm to investigate the impact on WSD performance of different tokenisation schemes, scoring mechanisms, methods of gloss extension and filtering methods. In experimentation over the Lexeed Sensebank and the Japanese Senseval2 dictionary task, we demonstrate that character bigrams with sense-s...
An Unsupervised Approach to Domain-Specific Term Extraction
Domain-specific terms provide vital semantic information for many natural language processing (NLP) tasks and applications, but remain a largely untapped resource in the field. In this paper, we propose an unsupervised method to extract domain-specific terms from the Reuters document collection using term frequency and inverse document frequency.